Non-blocking minimum processes coordinated checkpointing for hierarchical computational grid

Similar resources

Hierarchical Coordinated Checkpointing Protocol

The coordinated checkpointing protocol is a simple and useful protocol for fault tolerance in distributed systems on a LAN. However, its checkpoint overhead is bottlenecked by link speed: the overhead increases even if only one link in the network is slow. In a metacomputing environment, where a distributed application communicates over a low-speed WAN,...

Blocking and non-blocking coordinated checkpointing for large scale MPI computation

Nowadays, clusters and grids are made of more and more computing nodes. Multi-process applications are most often programmed through message passing. As the number of processes increases, these applications need to use a fault-tolerant message passing library. In this paper, we present two implementations of fault-tolerant protocols based on MPICH, a blocki...

Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant MPI Protocols

A long-term trend in high-performance computing is the increasing number of nodes in parallel computing platforms, which entails a higher failure probability. Fault tolerant programming environments should be used to guarantee the safe execution of critical applications. Research in fault tolerant MPIs has led to the development of several fault tolerant MPI environments. Different approaches a...

Minimum Process Coordinated Checkpointing Scheme for Ad Hoc Networks

The wireless mobile ad hoc network (MANET) architecture is one consisting of a set of mobile hosts capable of communicating with each other without the assistance of base stations. This has made possible creating a mobile distributed computing environment and has also brought several new challenges in distributed protocol design. In this paper, we study a very fundamental problem, the fault tol...

A Minimum-Process Coordinated Checkpointing Protocol For Mobile Distributed System

Mobile distributed systems raise issues such as mobility, low bandwidth of wireless channels, lack of stable storage on mobile nodes, disconnections, limited battery power, and a high failure rate of mobile nodes. These issues make traditional checkpointing techniques designed for distributed systems unsuitable for mobile environments. In this paper, we design a ...

Journal

Journal title: The International Conference on Electrical Engineering

Year: 2012

ISSN: 2636-4441

DOI: 10.21608/iceeng.2012.32714